Training Neural Networks for Protein Secondary Structure Prediction: The Effects of Imbalanced Data Set

نویسندگان

  • Viviane Palodeto
  • Hernán Terenzi
  • Jefferson Luiz Brum Marques
چکیده

Protein secondary structure prediction (PSSP) is one of the main tasks in computational biology. During the last few decades, much effort has been made towards solving this problem, with various approaches, mainly artificial neural networks (ANN). Generally, in order to predict the protein secondary structure, the ANN training process is performed using CB513 data set. Like protein structures databases, this data set is imbalanced and it can cause a low error rate for the majority class and an undesirable error rate for the minority class. In this paper we evaluate the effects of an imbalanced data set in training and learning of neural networks when they are applied to predict protein secondary structure. For this we applied resampling methods to tackle the imbalance class problem. Results show that imbalanced data sets decrease the helixes predictions rates. Although, protein data set distribution does not affect significantly the global accuracy (Q3).

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

DNA sequence, containing all genetic traits is not a functional entity. Instead, it transfers to protein sequences by transcription and translation processes. This protein sequence takes on a 3D structure later, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...

متن کامل

The Prediction of Surface Tension of Ternary Mixtures at Different Temperatures Using Artificial Neural Networks

In this work, artificial neural network (ANN) has been employed to propose a practical model for predicting the surface tension of multi-component mixtures. In order to develop a reliable model based on the ANN, a comprehensive experimental data set including 15 ternary liquid mixtures at different temperatures was employed. These systems consist of 777 data points generally containing hydrocar...

متن کامل

Nyquist Plots Prediction Using Neural Networks in Corrosion Inhibition of Steel by Schiff Base

The corrosion inhibition effect of N,N′-bis(n-Hydroxybenzaldehyde)-1,3-Propandiimine on mild steel has been investigated in 1 M HCl using electrochemical impedance spectroscopy. A predictive model was presented for Nyquist plots using an artificial neural network. The proposed model predicted the imaginary impedance based on the real part of the impedance as a function of time. The model to...

متن کامل

PREDICTION OF COMPRESSIVE STRENGTH AND DURABILITY OF HIGH PERFORMANCE CONCRETE BY ARTIFICIAL NEURAL NETWORKS

Neural networks have recently been widely used to model some of the human activities in many areas of civil engineering applications. In the present paper, artificial neural networks (ANN) for predicting compressive strength of cubes and durability of concrete containing metakaolin with fly ash and silica fume with fly ash are developed at the age of 3, 7, 28, 56 and 90 days. For building these...

متن کامل

Multi-Step-Ahead Prediction of Stock Price Using a New Architecture of Neural Networks

Modelling and forecasting Stock market is a challenging task for economists and engineers since it has a dynamic structure and nonlinear characteristic. This nonlinearity affects the efficiency of the price characteristics. Using an Artificial Neural Network (ANN) is a proper way to model this nonlinearity and it has been used successfully in one-step-ahead and multi-step-ahead prediction of di...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009